AITopics | value layer

Collaborating Authors

value layer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

a102dd5931da01e1b40205490513304c-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 01:43:51 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models

Neural Information Processing SystemsOct-10-2025, 11:46:46 GMT

Diffusion models (DMs) produce very detailed and high-quality images.

memorization, memorization neuron, neuron, (14 more...)

Neural Information Processing Systems

Country: Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.57)

Add feedback

Optimised Grouped-Query Attention Mechanism for Transformers

Chen, Yuang, Zhang, Cheng, Gao, Xitong, Mullins, Robert D., Constantinides, George A., Zhao, Yiren

arXiv.org Artificial IntelligenceJun-21-2024

Grouped-query attention (GQA) has been widely adopted in LLMs to mitigate the complexity of multi-head attention (MHA). To transform an MHA to a GQA, neighbour queries in MHA are evenly split into groups where each group shares the value and key layers. In this work, we propose AsymGQA, an activation-informed approach to asymmetrically grouping an MHA to a GQA for better model performance. Our AsymGQA outperforms the GQA within the same model size budget. For example, AsymGQA LLaMA-2-7B has an accuracy increase of 7.5% on MMLU compared to neighbour grouping. Our approach addresses the GQA's trade-off problem between model performance and hardware efficiency.

asymgqa, group size, similarity, (12 more...)

arXiv.org Artificial Intelligence

2406.14963

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models

Hintersdorf, Dominik, Struppek, Lukas, Kersting, Kristian, Dziedzic, Adam, Boenisch, Franziska

arXiv.org Artificial IntelligenceJun-4-2024

Diffusion models (DMs) produce very detailed and high-quality images. Their power results from extensive training on large amounts of data, usually scraped from the internet without proper attribution or consent from content creators. Unfortunately, this practice raises privacy and intellectual property concerns, as DMs can memorize and later reproduce their potentially sensitive or copyrighted training images at inference time. Prior efforts prevent this issue by either changing the input to the diffusion process, thereby preventing the DM from generating memorized samples during inference, or removing the memorized data from training altogether. While those are viable solutions when the DM is developed and deployed in a secure and constantly monitored environment, they hold the risk of adversaries circumventing the safeguards and are not effective when the DM itself is publicly released. To solve the problem, we introduce NeMo, the first method to localize memorization of individual data samples down to the level of neurons in DMs' cross-attention layers. Through our experiments, we make the intriguing finding that in many cases, single neurons are responsible for memorizing particular training samples. By deactivating these memorization neurons, we can avoid the replication of training data at inference time, increase the diversity in the generated outputs, and mitigate the leakage of private and copyrighted data. In this way, our NeMo contributes to a more responsible deployment of DMs.

memorization, memorization neuron, neuron, (13 more...)

arXiv.org Artificial Intelligence

2406.02366

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia > North Korea (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Media (0.93)
Leisure & Entertainment (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

SNP: Structured Neuron-level Pruning to Preserve Attention Scores

Shim, Kyunghwan, Yun, Jaewoong, Choi, Shinkook

arXiv.org Artificial IntelligenceApr-17-2024

Multi-head self-attention (MSA) is a key component of Vision Transformers (ViTs), which have achieved great success in various vision tasks. However, their high computational cost and memory footprint hinder their deployment on resource-constrained devices. Conventional pruning approaches can only compress and accelerate the MSA module using head pruning, although the head is not an atomic unit. To address this issue, we propose a novel graph-aware neuron-level pruning method, Structured Neuron-level Pruning (SNP). SNP prunes neurons with less informative attention scores and eliminates redundancy among heads. Specifically, it prunes graphically connected query and key layers having the least informative attention scores while preserving the overall attention scores. Value layers, which can be pruned independently, are pruned to eliminate inter-head redundancy. Our proposed method effectively compresses and accelerates Transformer-based models for both edge devices and server processors. For instance, the DeiT-Small with SNP runs 3.1$\times$ faster than the original model and achieves performance that is 21.94\% faster and 1.12\% higher than the DeiT-Tiny. Additionally, SNP combine successfully with conventional head or block pruning approaches. SNP with head pruning could compress the DeiT-Base by 80\% of the parameters and computational costs and achieve 3.85$\times$ faster inference speed on RTX3090 and 4.93$\times$ on Jetson Nano.

attention score, msa module, pruning, (13 more...)

arXiv.org Artificial Intelligence

2404.1163

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Divergent Token Metrics: Measuring degradation to prune away LLM components -- and optimize quantization

Deiseroth, Björn, Meuer, Max, Gritsch, Nikolas, Eichenberg, Constantin, Schramowski, Patrick, Aßenmacher, Matthias, Kersting, Kristian

arXiv.org Artificial IntelligenceNov-13-2023

Large Language Models (LLMs) have reshaped natural language processing with their impressive capabilities. Their ever-increasing size, however, raised concerns about their effective deployment and the need for LLM compressions. This study introduces the Divergent Token metrics (DTMs), a novel approach for assessing compressed LLMs, addressing the limitations of traditional perplexity or accuracy measures that fail to accurately reflect text generation quality. DTMs focus on token divergence, that allow deeper insights into the subtleties of model compression, i.p. when evaluating component's impacts individually. Utilizing the First Divergent Token metric (FDTM) in model sparsification reveals that a quarter of all attention components can be pruned beyond 90% on the Llama-2 model family, still keeping SOTA performance. For quantization FDTM suggests that over 80% of parameters can naively be transformed to int8 without special outlier management. These evaluations indicate the necessity of choosing appropriate compressions for parameters individually-and that FDTM can identify those-while standard metrics result in deteriorated outcomes.

mlp gate layer, module, value layer, (15 more...)

arXiv.org Artificial Intelligence

2311.01544

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback